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ABSTRACT 

In  the  context  of  random  variate  generation  on  digital 
computers,  the  use  of  piece-wise  linear  majorizing  functions 
in  conjunction  with  the  general  rejection  algorithm  is  pro- 
posed. Based  on  previous  results  obtained  in  the  generation 
of  beta  variates,  the  expected  advantages  and  disadvantages 
of  applying  the  concept  to  other  distributions  are  discussed, 
as  is  the  use  of  minorizing  functions  for  fast  acceptance  of 
values.  Areas  of  potential  application  are  also  discussed. 
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I . INTRODUCTION 

Much  literature  in  the  last  twenty  years 
has  been  devoted  to  process  generation  on 
digital  computers.  Process  generation  is  the 
creation  of  a sequence  of  observations  having 
the  properties  of  some  desired  distribution 
or  process.  Almost  always  process  generation 
is  a transformation  of  one  or  more  uniform 
(0,1)  values  to  the  desired  distribution. 
Common  methods  for  performing  the  transforma- 
tion include  using  the  inverse  distribution 
function  transformation,  rectangular  approxi- 
mation, special  properties,  composition,  and 
rejection.  (See  for  example,  [2,5,6].)  In 
the  past  most  interest  has  centered  on  the 
first  three  methods.  Recently  composition 
and  rejection  have  received  much  attention. 

In  a wide  variety  of  cases,  rejection  is 
fast,  easy  to  code,  and  requires  little 
memory . 

In  this  paper  the  use  of  piece-wise 
linear  rejection  functions  and  methods  for 
fast  acceptance  of  observations  are  discussed 
for  the  univariate  continuous  case.  The  beta 
distribution  is  used  as  an  example,  drawing 
on  the  results'  of  Schmeiser  and  Shalaby  [8]  . 
Discussion  centers  on  concepts  necessary  for 
generalizing  the  results  to  other  distribu- 
tions. The  general  rejection  algorithm  is 
discussed  in  Section  II  and  specialized  to 
the  piece-wise  linear  case  in  Section  III. 
Discussed  are  limitations  in  Section  IV,  fast 
acceptance  in  Section  V,  and  potential  appli- 
cations in  Section  VI. 

II.  THE  GENERAL  REJECTION  ALGORITHM 

In  this  section  a general  form  of  random 
variate  generation  using  rejection  is  given. 
This  general  form  is  specialized  to  the 
piece-wise  linear  special  case.  Implications 
and  applicability  of  the  piece-wise  linear 
approach  are  discussed. 

The  common  rectangular  rejection  algor- 
ithm can  be  generalized  as  follows.  Let  p(x) 
be  the  density  function  from  which  random 
variates  are  to  be  generated.  Let  t(x)  be  a 
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majoriziiig  function  of  p(x);  i . e . , t (x)^  p (x) 
for  all  X.  Corresponding  to  t (x)  is  the 
density  function  r(x)  = t(x)  / k,  where 

k = t(x)  dx.  Figure  1 illustrates  the 
j «>00 

relationship  of  p(x),  t (x)  and  r(x).  The 
general  rejection  algorithm  for  generating 
variates  from  p(x)  is 

1.  Generate  a value  x from  r(*)- 

2.  Generate  a value  u from  the 
rectangular  distribution  over  the 
interval  [0,1] . 

3.  If  u ^ p(x)  / t(x),  accept  x by 
setting  y = X.  Otherwise  go  to 
step  1. 


Proposition : The  algorithm  provides  values 
of  y from  the  distribution  having  density 
function  p(* ) • 

Proof:  Let  A denote  the  event  that  step  3 
results  in  acceptance.  In  any  given  itera- 
tion 

P(A  I x)  = p(x)  / t(x)  = p(x)  / [k  r(x)] 


Figure  1. 


The  functions  used  in  the 
general  rejection  algorithm 
for  generating  random 
variates  from  p(x). 
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Let  Y be  the  random  variable  resulting 
from  the  algorithm.  It  is  necessary  to  show 
that  P(Y  ^I)  = /j  p(x)  dx  for  any  interval  I. 

The  proof  follows  directly  from 

P(Y^I)  = P(X^I  I A) 

= P(A  I X^I)  P(X^I)  / P(A) 

^ K p(x)  dx  r 

•'■rr(x)  dx/[l/k] 

t(x)  dx 

= p(x)  dx  as  desired.  ■ 

Although  stated  in  a different  form, 
this  rejection  algorithm  is  mathematically 
equivalent  to  that  described  by  Tocher 
[11,  p.  25). 

For  a given  majorizing  function  t(x)  = 
k r(x) , k is  selected  to  be  as  small  as 
possible  while  still  maintaining  t(x)^p(x) 
for  all  X.  This  results  in  maximizing  the 
probability  P(A)  that  in  any  given  iteration 
the  value  x generated  in  step  1 will  be 
accepted  in  step  3. 

For  a given  density  function  p(x)  the 
choice  of  majorizing  function  t(x)  = k r(x) 
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plays  a central  role  in  determining  whether 
or  not  the  resulting  algorithm  is  efficient. 
The  majorizing  function  must  both  have  near- 
ly the  same  shape  as  p(x)  (thereby  resulting 
in  a small  value  of  k)  and  a density  function 
r(x)  which  is  amenable  to  variate  genera- 
tion (via  any  technique,  but  probably  not 
rej  action) . 

The  reason  for  early  disfavor  of  rejec- 
tion was  the  selection  of  a uniform  distri- 
bution for  r(x).  (In  fact,  many  textbooks 
discuss  only  this  special  case.)  The  rec- 
tangular assumption  restricts  consideration 
to  distributions  having  a finite  range  or  to 
approximations  obtained  by  truncation. 

While  such  approximations  may  be  made  as 
accurate  as  desired  in  theory,  great  ineffi- 
ciency results  from  using  a rectangular  dis- 
tribution to  model  tails  having  small  proba- 
bilities . 

In  the  last  several  years  the  use  of 
non-rectangular  rejection  regions  has  appeared 
more  frequently  in  the  literature.  The  gamma 
distribution  especially  has  been  the  topic  of 
several  papers  [1,9,10,12].  All  of  these 
papers  have  used  density  functions  r(x) 
corresponding  to  well-known  distributions. 

The  basic  results  of  these  papers  is  the  iden- 
tification of  suitable  functions  r(x)  and  the 
determination  of  k such  that  valid  and 
efficient  algorithms  result. 

III.  PIECE-WISE  LINEAR  MAJORIZING  FUNCTIONS 

While  the  use  of  common  theoretical 
distributions  for  r(x)  has  been  fruitful, 
another  approach  which  is  very  general  and 
often  easy  to  apply  is  to  use  piece-wise 
linear  majorizing  functions.  Piece-wise 
linearization  calls  for  partitioning  the 
range  of  the  random  variable  into  segments 
such  that  t(x)  is  linear  over  each  segment. 
The  usual  rectangular  rejection  region  is  a 
special  case  corresponding  to  only  one  seg- 
ment. Another  special  case,  briefly  dis- 
cussed by  Lewis  [5,  p.  82],  is  the  use  of 
"regular  parts,"  which  is  a discontinuous 
piece-wise  linear  majorizing  function  having 
only  rectangular  segments.  More  generally, 
however,  the  linear  segments  may  lie  at  the 
angle  providing  the  best  fit  to  p(x).  As  an 
example,  Schmeiser  and  Shalaby  [8]  used  a 
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piece-wise  linear  majorizing  funccion  in  con- 
sidering rejection  methods  for  the  beta  dis- 
tribution. Figure  2 illustrates  the  algor- 
ithm for  a particular  beta  density  function. 

Step  1 of  the  algorithm  requires  genera- 
tion of  variates  from  the  density  r(x)  = 
t(x)/k.  Now  the  piece-wise  linear  r(x)  is 
composed  of  a mixture  of  rectangular,  tri- 
angular, and  trapezoidal  densities.  Note  in 
Figure  2 that  the  trapezoidal  densities  are 
each  composed  of  a rectangular  lower  density 
and  a triangular  upper  density.  Thus, 
n ^ 

r(x)  =2  a.  r. (x)  where  Z a.  = 1 and 

i=l  ^ ^ i=l  ^ 


0 < < 1 for  all  i,  i = 1,2,...,  n and  each 
r^(x)  is  either  a rectangular  or  a triangular 
density. 


Figure  2.  Rejection  algorithm  using  a 
piece-wise  linear  majorizing 
function  t(x). 

The  generation  of  variates  from  a piece- 
wise  linear  r(x)  (using  the  composition 
method)  requires  the  generation  of  a variate 
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trom  r.(x)  wich  probability  a^.  The  rectan- 
gular aensities  may  be  easily  generated  using 
X = a + (b  - a)  u where  u is  a uniform  (0,1) 
variate  and  a and  b are  the  bounds  of  the 
rectangular  density  fmction.  The  triangular 
densities  require  x = a + (b  - a)  max  (u, , U2) 
when  r^(x)  has  a positive  slope  and 

X = a + (b  - a)  min  (u^,  U2)  when  r^(x)  has  a 
negative  slope,  where  u^^  and  U2  are  indepen- 
dently generated  uniform  (0,  1)  variates. 
Since  generation  from  a piece-wise  linear  r(x) 
requires  no  exponential  level  operations, 
step  1 can  be  executed  quite  rapidly. 

In  addition  the  probability  of  accep- 
tance in  step  3 can  be  made  close  to  one, 
since  a piece-wise  linear  majorizing  function 
can  be  made  to  fit  any  density  function  p(x) 
arbitrarily  well  by  increasing  the  number  of 
segments.  Here  a trade-off  developes  between 
few  segments  resulting  in  simple  coding  (with 
associated  minimal  memory  requirements)  and 
many  segments  resulting  in  longer  code,  more 
memory  requirements,  but  higher  probability 
of  acceptance.  The  use  of  even  a few  seg- 
ments provides  a considerably  better  fit  than 
the  simple  rectangular  region.  For  example, 
in  reference  [8]  three  and  five  segments  are 
used  in  two  of  the  beta  generation  algorithms. 
While  the  single  segment  provides  an  algor- 
ithm which  is  not  competitive  for  many  beta 
parameter  values,  five  segments  are  the  nu- 
cleus of  the  fastest  algorithm  available  for 
many  parameter  values. 

IV.  LIMITATIONS 

The  applicability  of  the  rejection  tech- 
nique is  dependent  only  upon  the  selection  of 
the  majorizing  function.  For  a particular 
density  p(x),  the  minimal  value  of  k such  that 
k r(x)  > p(x)  for  all  x is  central  to  the 
applicaFility  of  the  rejection  technique. 

This  inequality  implies  two  conditions : 

1)  k r(x)  must  be  greater  than  zero  whenever 
p(x)  is  greater  than  zero  and 

2)  lim  k r(x)  = “>  whenever  lim  p(x)  = » . 

x-^a  x->-a 

Since  k must  be  greater  than  one  and 
finite  these  two  conditions  become  1) 
r(x)  must  be  greater  than  zero  whenever  p(x) 
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is  greater  than  zero  and  2)  lim  r(x)  = “ when- 

x-»-a 

ever  lim  p(x)  = “ . Since  the  piece-wise 

x-^a 

linear  majorizing  function  cannot  be  non-zero 
at  infinity  nor  be  infinite,  the  piece-wise 
approach  is  applicable  only  as  an  approxim-a- 
tion  to  distributions  having  densities  with 
one  or  more  infinite  values  and  to  distribu- 
tions with  infinite  length  ranges. 

These  are  two  theoretically  important 
restrictions.  For  example,  the  beta  distri- 
bution with  parameters  less  than  one,  the 
gamma  distribution  with  shape  parameter  less 
than  one,  and  some  members  of  a general  fami- 
ly of  distributions  of  Schmeiser  and  Deutsch 
[7]  have  points  at  which  p(x)  is  infinite. 

In  addition,  many  distributions  have  infinite 
length  ranges,  most  commonly  (-“,  »)  and 
[0,  «>)  . 

However  these  restrictions  are  not  im- 
portant in  practice.  First  consider  the 
problem  of  lim  p(x)  = " . Few  computers  have 

-8 

accuracy  beyond  10~  , nor  do  many  applica- 
tions require  more  accuracy.  If  e is  the  mini- 
mum discernable  accuracy,  then  an  approxima- 
tion using  the  finite  values  r(a  + e)  or 
r(a  - e)  rather  than  the  infinite  r(a)  will 
probably  be  quite  acceptable.  The  second 
problem  of  infinite  length  range  may  be  over- 
come by  including  so  much  of  the  range  that 
the  excluded  portion  will  never  be  missed. 

For  example,  consider  the  normal  distribution. 
While  having  a range  of  (-",  ») , the  probab- 
ility of  observing  a point  more  than  ten 
standard  deviations  from  the  mean  is  only 
1.524  X lOE-23.  Of  course,  if  this  is 
imacceptable  then  one  hundred  standard 
deviations  can  be  included  with  little 
additional  cost. 

Although  the  theoretical  limitations  are 
not  important,  the  use  of  a piece-wise  linear 
majorizing  function  does  require  that  the 
density  function  p(x)  yields  r(x)  with  a 
reasonable  amount  of  effort.  For  distribu- 
tions having  only  a single  shape  this  is  not 
important,  since  an  appropriate  r(x)  may  be 
determined  once  and  for  all.  For  example, 
see  Kinderman  and  Ramage's  [4]  normal  variate 
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generator.  However,  families  of  distribu- 
tions such  as  the  gamma  and  beta  include  mul- 
tiple shapes.  A generator  designed  to  gener- 
ate values  from  any  member  of  the  family  must 
be  able  to  quickly  determine  an  appropriate 
majorizing  function.  For  example  the  beta 
generators  of  reference  [8]  use  the  equations 
for  the  location  of  the  mode  and  points  of 
inflexion  to  locate  the  junctures  of  the 
piece-wise  linear  segments.  The  points  of 
j inflexion  are  critical  since  the  convexity 

I or  concavity  of  the  density  function  at 

i various  points  is  necessary  to  prove  that 

, indeed  k r(x)  > p(x)  for  all  x. 

j 

i V.  FAST  ACCEPTANCE 

1 

1 

In  addition  to  choosing  r(x)  such  that 
the  probability  of  acceptance  is  high, 
another  function  b(x)  may  be  chosen  to  reduce 
the  time  necessary  to  determine  whether  or 
not  u £ p(x)  / t(x)  in  step  3.  If  b(x)  is 
substantially  faster  to  evaluate  than  p(x) 
and  if  b(x)  ^ p(x)  for  all  x,  then  the 
algorithm  may  be  made  faster  by  replacing 
step  3 with 

3(a).  If  u t(x)  < b(x),  accept  x by 
setting  y = X. 

3(b).  If  u t(x)  £ p(x),  accept  x by 
! ' setting  y = X.  Otherwise  go  to 

: step  1. 

The  theory  underlying  the  algorithm  has  not 
changed,  since  step  3(a)  accepts  x only  if 
f 3(b)  would  accept  x anyway.  However,  the  use 

i of  step  3(a)  often  makes  the  evaluation  of 

t p(x)  unnecessary.  Since  density  functions 

; often  include  time  consuming  operations  such 

; as  exponential  and  gamma  functions , the 

I savings  due  to  this  minorizing  function  can 

be  substantial.  In  particular,  a piece-wise 
linear  minorizing  function  is  often  easy  to 
; determine  and  is  always  fast  to  evaluate. 

Sometime  a minorizing  function  is  not 
necessary.  If  a trapezoidal  region  is  formed 
by  t(x)  which  is  composed  of  triangular  and 
rectangular  regions,  then  the  rectangular 
region  is  often  entirely  under  p(x).  In  this 
case  no  check  is  necessary  in  step  3,  the 
' value  of  X being  accepted  automatically. 

't 

i 

:i 
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Note  also  that  the  minorizing  function 
b(x)  may  be  used  with  any  majorizing  function, 
not  just  the  piece-wise  linear  functions 
discussed  in  Sections  II  and  III. 

VI.  POTENTIAL  APPLICATIONS 

There  are  a number  of  distributions  to 
which  the  above  concepts  can  be  applied.  The 
gamma  distribution  generators  currently 
available  involve  several  logarithmic  oper- 
ations. The  use  of  piece-wise  linear  major- 
izing and  minorizing  functions  would  almost 
certainly  be  faster.  Pearson  distributions 
other  than  the  gamma  may  also  be  amenable  to 
the  piece-wise  linear  approach.  Two  families 
which  have  well-known,  but  slow,  generators, 
the  Weibull  and  lognormal,  could  also  be 
generated  using  this  approach.  The  F and  t 
distributions,  classically  generated  using 
their  relationships  to  the  normal  or  the  beta 
distributions,  could  be  more  quickly  generated 
using  the  above  techniques.  In  addition,  the 
J-shaped  beta  family  could  benefit  from  such 
techniques.  (Jbhnk's  algorithm  [3]  for  U- 
shaped  and  the  algorithms  discussed  in 
Schmeiser  and  Shalaby  [8]  for  bell-shaped  beta 
distributions  probably  preclude  much  faster 
times  using  the  above  techniques.) 

It  is  also  possible  that  the  piece-wise 
linear  techniques  can  be  applied  to  discrete 
distributions  such  as  the  Poisson  and  bino- 
mial. Current  generators  for  these  two 
distributions  require  times  proportional  to 
the  mean  of  the  Poisson  and  to  the  number  of 

trials  for  the  binomial.  The  commonly  used 
normal  approximations  could  be  avoided  by  the 
use  of  rejection  techniques. 

VII.  SUMMARY  AND  CONCLUSIONS 

The  commonly  used  rectangular  rejection 
region  has  been  generalized  and  the  most  gen- 
eral form  of  rejection  has  been  specialized  to 
the  use  of  piece-wise  linear  majorizing  func- 
tions. The  implications  of  the  piece-wise 
linear  approach  have  been  discussed  and  poten- 
tial applications  have  been  mentioned. 
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generator.  However,  families  of  distribu- 
tions such  as  the  gamma  and  beta  include  mul- 
tiple shapes.  A generator  designed  to  gener- 
ate values  from  any  member  of  the  family  must 
be  able  to  quickly  determine  an  appropriate 
majorizing  function.  For  example  the  beta 
generators  of  reference  [8]  use  the  equations 
for  the  location  of  the  mode  and  points  of 
inflexion  to  locate  the  junctures  of  the 
piece-wise  linear  segments.  The  points  of 
inflexion  are  critical  since  the  convexity 
or  concavity  of  the  density  function  at 
various  points  is  necessary  to  prove  that 
indeed  k r(x)  ^ p(x)  for  all  x. 

V.  FAST  ACCEPTANCE 

In  addition  to  choosing  r(x)  such  that 
the  probability  of  acceptance  is  high, 
another  function  b(x)  may  be  chosen  to  reduce 
the  time  necessary  to  determine  whether  or 
not  u £ p(x)  / t(x)  in  step  3.  If  b(x)  is 
substantially  faster  to  evaluate  than  p(x) 
and  if  b(x)  ^ p(x)  for  all  x,  then  the 
algorithm  may  be  made  faster  by  replacing 
step  3 with 

3(a).  If  u t(x)  < b(x),  accept  x by 
setting  y = X. 

3(b).  If  u t(x)  <_  p(x),  accept  x by 

setting  y = X.  Otherwise  go  to 
step  1. 

The  theory  underlying  the  algorithm  has  not 
changed,  since  step  3(a)  accepts  x only  if 
3(b)  would  accept  x anyway.  However,  the  use 
of  step  3(a)  often  makes  the  evaluation  of 
p(x)  unnecessary.  Since  density  functions 
often  include  time  consuming  operations  such 
as  exponential  and  gamma  functions , the 
savings  due  to  this  mincrizing  function  can 
be  substantial.  In  particular , a piece-wise 
linear  minorizing  function  is  often  easy  to 
determine  and  is  always  fast  to  evaluate. 

Sometime  a minorizing  function  is  not 
necessary.  If  a trapezoidal  region  is  formed 
by  t(x)  which  is  composed  of  triangular  and 
rectangular  regions,  then  the  rectangular 
region  is  often  entirely  under  p(x).  In  this 
case  no  check  is  necessary  in  step  3,  the 
value  of  X being  accepted  automatically. 
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Note  also  that  the  minorizing  function 
b(x)  may  be  used  with  any  majorizing  function, 
not  just  the  piece-wise  linear  functions 
discussed  in  Sections  II  and  III. 

VI.  POTENTIAL  APPLICATIONS 

There  are  a number  of  distributions  to 
which  the  above  concepts  can  be  applied.  The 
gamma  distribution  generators  currently 
available  involve  several  logarithmic  oper- 
ations. The  use  of  piece-wise  linear  major- 
izing and  minorizing  functions  would  almost 
certainly  be  faster.  Pearson  distributions 
other  than  the  gamma  may  also  be  amenable  to 
the  piece-wise  linear  approach.  Two  families 
which  have  well-known,  but  slow,  generators, 
the  Weibull  and  lognormal,  could  also  be 
generated  using  this  approach.  The  F and  t 
distributions,  classically  generated  using 
their  relationships  to  the  normal  or  the  beta 
distributions,  could  be  more  quickly  generated 
using  the  above  techniques.  In  addition,  the 
J-shaped  beta  family  could  benefit  from  such 
techniques.  (Johnk's  algorithm  [3]  for  U- 
shaped  and  the  algorithms  discussed  in 
Schmeiser  and  Shalaby  [8]  for  bell-shaped  beta 
distributions  probably  preclude  much  faster 
times  using  the  above  techniques.) 

It  is  also  possible  that  the  piece-wise 
linear  techniques  can  be  applied  to  discrete 
distributions  such  as  the  Poisson  and  bino- 
mial. Current  generators  for  these  two 
distributions  require  times  proportional  to 
the  mean  of  the  Poisson  and  to  the  number  of 

trials  for  the  binomial.  The  commonly  used 
normal  approximations  could  be  avoided  by  the 
use  of  rejection  techniques. 

VII.  SUMMARY  AND  CONCLUSIONS 

The  commonly  used  rectangular  rejection 
region  has  been  generalized  and  the  most  gen- 
eral form  of  rejection  has  been  specialized  to 
the  use  of  piece-wise  linear  majorizing  func- 
tions. The  implications  of  the  piece-wise 
linear  approach  have  been  discussed  and  poten- 
tial applications  have  been  mentioned. 
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The  rejection  algorithm  using  the  piece- 
wise  linear  majorizing  function  has  both 
advantages  and  disadvantages  compared  to  other 
rejection  methods.  The  disadvantage  is  that 
the  determination  of  the  piece-wise  linear 
majorizing  function  can  require  a non- trivial 
set-up  cost.  The  expected  advantages  are 

1.  applicability  to  a wide  variety  of 
distributions , 

2.  fast  and  easy  generation  of  x in 
step  1, 

3.  high  probability  of  acceptance  in 
step  3,  and 

4.  fast  acceptance  in  step  3(a). 
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